ACCURACY  OF  PERCEIVED 
TEST-ITEM  DIFFICULTIES 


J. Stephen Prestwood
and
David J. Weiss


RESEARCH  REPORT  77-3 
MAY  1977 


Psychometric  Methods  Program 
Department  of  Psychology 
University  of  Minnesota 
Minneapolis,  MN  55455 



Prepared under contract No. N00014-76-C-0243, NR150-382
with  the  Personnel  and  Training  Research  Programs 
Psychological  Sciences  Division 
Office  of  Naval  Research 

Approved  for  public  release;  distribution  unlimited. 
Reproduction  in  whole  or  in  part  is  permitted  for 
any  purpose  of  the  United  States  Government. 


Unclassified
SECURITY CLASSIFICATION OF THIS PAGE (When Data Entered)

REPORT DOCUMENTATION PAGE

1.  REPORT NUMBER: Research Report 77-3
4.  TITLE (and Subtitle): Accuracy of Perceived Test-Item Difficulties
5.  TYPE OF REPORT & PERIOD COVERED: Technical Report
7.  AUTHOR(s): J. Stephen Prestwood and David J. Weiss
8.  CONTRACT OR GRANT NUMBER(s): N00014-76-C-0243
9.  PERFORMING ORGANIZATION NAME AND ADDRESS:
    Department of Psychology
    University of Minnesota
    Minneapolis, Minnesota 55455
10. PROGRAM ELEMENT, PROJECT, TASK AREA & WORK UNIT NUMBERS:
    P.E.: 61153N   PROJ.: RR042-04
    T.A.: RR042-04-01
    W.U.: NR150-382
11. CONTROLLING OFFICE NAME AND ADDRESS:
    Personnel and Training Research Programs
    Office of Naval Research
    Arlington, Virginia 22217
12. REPORT DATE: May 1977
15. SECURITY CLASS. (of this report): Unclassified
16. DISTRIBUTION STATEMENT (of this Report):
    Approved for public release; distribution unlimited. Reproduction in whole
    or in part is permitted for any purpose of the United States Government.
19. KEY WORDS:
    testing
    ability testing
    computerized testing
    adaptive testing
    sequential testing
    branched testing
    individualized testing
    tailored testing
    programmed testing
    response-contingent testing
    automated testing

20. ABSTRACT

This study investigated the accuracy with which testees perceive the
difficulty of ability-test items. Two 41-item conventional tests of verbal
ability were constructed for administration to testees in two ability groups.
Testees in both the high- and low-ability groups responded to each
multiple-choice item by choosing the correct alternative and then rating the
item's difficulty relative to their levels of ability. Least-squares estimates
of item difficulty, which were based solely on the difficulty ratings,
correlated highly with proportion-correct and latent trait estimates of item
difficulty based on a norming sample. Least-squares estimates of testee
ability, which were based solely on the difficulty perceptions of the testees,
correlated significantly with number-correct and maximum-likelihood ability
scores based on the testees' conventional responses to the items. These
results show that item-difficulty perceptions were highly related to the
"objective" indices of item difficulty often used in test construction, and
that as testee ability level increased, the items were perceived as being
relatively less difficult. The relationship between a testee's ability and
his/her perception of an individual item's relative difficulty appeared to be
weak. Of major importance was the finding that items which were appropriate
in difficulty levels from a psychometric standpoint were perceived by the
testees as being too difficult for their ability levels. The effects on
testees of tailoring a test such that items are perceived as being uniformly
too difficult should be investigated.




CONTENTS

Introduction
Method
    Test Construction
    Procedure
        Subjects
        Test administration
    Design
Accuracy of Difficulty Perceptions
    Method of Analysis
        Difficulty perception model
        Accuracy of ratings-based estimates
        Dimensionality of difficulty perceptions
    Results
        Dimensionality of difficulty perceptions
        Accuracy of ratings-based estimates
Difficulty Perceptions of Individual Items
    Method of Analysis
    Results
Perceptions of Appropriate Item Difficulty
    Method of Analysis
    Results
Discussion
Conclusions
References
Appendix A: Item Calibration Procedure
Appendix B: Supplementary Tables


Technical  editing  by  Terryl  Graham 


Accuracy  of  Perceived  Test-Item  Difficulties 




Conventional  ability  tests  require  all  testees  to  answer  the  same  set  of 
test  items.  Because  testees  differ  in  ability  level,  however,  tests  of  this 
kind  may  potentially  create  differential  psychological  environments  for  testees 
of  different  ability  levels.  A test  which  is  appropriately  difficult  for  a 
testee  of  average  ability  may  be  perceived  by  less  able  individuals  as  being 
much  too  difficult,  and  such  perceptions  may  lead  these  testees  to  approach  the 
task  with  anxiety  and  forbearance.  On  the  other  hand,  individuals  with  higher 
than  average  abilities  may  find  the  task  a simple  or  even  pleasant  one. 

Clearly,  the  psychological  environment  of  a testee  may  vary  greatly  depending 
on  the  individual's  perception  of  the  task. 

Adaptive  tests  are  designed  such  that  each  testee  receives  items  which  are 
psychometrically  appropriate  for  his/her  ability  level  (Lord,  1970;  Weiss,  1974; 
Weiss  & Betz,  1973).  For  example,  items  in  such  tests  may  be  chosen  so  that 
each  testee,  regardless  of  ability  level,  will  have  approximately  a fifty-percent 
chance of answering the item correctly (e.g., Lord, 1970). The adaptive test
may thus reduce the differential psychological environments arising from the
administration of a fixed set of items to persons of differing ability levels,
and may thereby improve the performance of low-ability students. In fact,
under certain conditions, adaptive testing has been shown to be more motivating
for low-ability testees (Betz & Weiss, 1976b) and to result in higher ability
estimates (Betz & Weiss, 1976a).

Holtzman  (1970)  points  out  the  potential  importance  of  psychological  factors 
in the estimation of an individual's ability:

It  may  be  important  to  investigate  the  interaction  of  personality 
and  situational  factors  with  tailored  testing.  The  motivational  impact 
on  the  student  when  he  discovers  that  most  of  the  items  are  at  a certain 
level  of  difficulty  (or  uncertainty)  is  unknown.  The  optimal  level 
(or  mixture  of  levels)  for  a given  student  will  not  be  derived  from  test 
theory  alone;  information  about  student  anxiety  and  motivation  may  also 
be relevant (p. 199).

Whether  adaptive  tests  can  actually  reduce  the  differential  psychological 
effects  due  to  the  administration  of  an  inappropriately  easy  or  difficult  set 
of  test  items  depends  largely  on  whether  testees  can  accurately  perceive  the 
difficulties  of  the  items  administered.  Little  research  has  dealt  directly 
with  the  question  of  item-difficulty  perception. 

Munz and Jacobs (1971) asked introductory psychology students to scale
multiple-choice examination questions on the subjective difficulty an
introductory psychology student would experience in reaching a solution to a
particular test question. Thurstone's method of equal-appearing intervals was
used to derive difficulty scale values for the individual items. These scale
values correlated positively but moderately (r = .52) with traditional
proportion-correct difficulty indices based on the subsequent administration
of those items to other introductory psychology students. However, Munz and
Jacobs made no attempt to determine the accuracy with which individuals
perceived item difficulties relative to their own levels of ability. Further,
these results may be generalized only to other achievement-testing situations
where students have been exposed to the material and have made an attempt to
familiarize themselves with it.

Bratfisch, Dornic, and Borg (1972) asked individuals to estimate the
subjective difficulty of items from sets A, B, D, and E of Raven's Standard
Progressive Matrices. The items were first administered conventionally, in the
order of their "objective" difficulty as assessed by determining the proportion
of correct responses in a norming sample. Following this, the items were
presented in random order and estimates of their subjective difficulties were
obtained through a magnitude estimation procedure. The Spearman rank-order
correlation between the subjective difficulties of the items and the order of
their initial administration (i.e., their ranked "objective" difficulty) was
positive and high (ρ = .90). Unfortunately, the effect of the items' prior
administration in the order of their objective difficulty cannot be determined.

In another study by the same authors (Bratfisch, Borg, & Dornic, 1972),
testees  were  administered  numerical-reasoning,  spatial-ability,  or  verbal- 
comprehension  items  in  the  order  of  "objective"  difficulty  of  the  items  in  the 
tests.  Immediately  after  attempting  to  answer  each  item  in  the  conventional 
manner,  the  testees  rated  the  item's  difficulty  on  a nine-point  scale  where 
1 corresponded  to  a "very,  very  easy"  item  and  9 corresponded  to  a "very,  very 
hard"  item.  The  Spearman  correlations  between  order  of  administration  and 
perceived  difficulty  for  the  numerical-reasoning,  spatial-ability,  and  verbal- 
comprehension  tests  were  .97,  .92,  and  .92,  respectively.  Unfortunately,  in 
both  studies  by  these  authors,  the  subjective  difficulties  were  not  explicitly 
related  to  the  testees'  perceptions  of  an  item's  appropriateness  to  their 
ability  levels.  More  importantly,  in  both  studies,  it  is  impossible  to  separate 
the  effect  of  item  difficulty  from  that  of  order  of  administration. 

The  present  study  was  designed  to  determine  whether  or  not  testees  can 
perceive  the  difficulties  of  ability  test  items  relative  to  their  levels  of 
ability  and,  if  so,  to  investigate  the  accuracy  of  these  perceptions  for 
individual  items.  Additionally,  the  study  was  designed  to  determine  the  level 
of  item  difficulty  perceived  by  testees  as  being  appropriate  for  their  ability. 


Method 


Test  Construction 


Two 41-item conventional tests were designed which had a large range of
differences between the difficulties of successive items. Items for the tests
were chosen from a pool of five-alternative, multiple-choice vocabulary items
on the basis of their normal-ogive difficulty (b) and discrimination (a)
parameters (Lord & Novick, 1968). One of the tests was designed to be
administered to a group of relatively low-ability college students. The other
test was designed to be administered to a group of relatively higher ability
students.
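As an illustration of how the normal-ogive difficulty (b) and discrimination (a) parameters determine the chance of a correct response, the following minimal sketch evaluates the two-parameter normal-ogive response function, P = Φ(a(θ − b)). It ignores guessing on the five-alternative items, which is a simplifying assumption; the function name and the ability values are hypothetical, while a = 1.501 and b = -.488 are simply the mean parameters reported for the high-ability test.

```python
from statistics import NormalDist


def normal_ogive_p(theta: float, a: float, b: float) -> float:
    """Probability of a correct response under the two-parameter normal ogive.

    theta: testee ability; a: item discrimination; b: item difficulty.
    Guessing is ignored here (an assumption made for illustration).
    """
    return NormalDist().cdf(a * (theta - b))


# A testee whose ability equals the item's difficulty has a .50 chance of success.
p_matched = normal_ogive_p(theta=-0.488, a=1.501, b=-0.488)

# The same item is easier for a more able (hypothetical) testee.
p_able = normal_ogive_p(theta=0.5, a=1.501, b=-0.488)

print(round(p_matched, 2), round(p_able, 2))
```

Under this parameterization, an item is "psychometrically appropriate" in the fifty-percent sense when its difficulty b matches the testee's ability.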




The  item  parameter  estimates  were  based  initially  on  data  reported  by 
McBride  and  Weiss  (1974),  derived  from  samples  of  University  of  Minnesota 
undergraduates. These parameter estimates were revised using a procedure
essentially the same as that described by Jensema (1976). Appendix A
describes  the  process  of  developing  the  revised  item  parameters.  The  difficulty 
and  discrimination  parameters  for  each  test  item  are  shown  in  Appendix  Table  B-1. 

The low- and high-ability tests had mean difficulties of $\bar{b} = -2.190$ and
$\bar{b} = -.488$, respectively. Mean discrimination values for the low- and
high-ability tests were $\bar{a} = 1.117$ and $\bar{a} = 1.501$, respectively.

Procedure 


Subjects. Two groups of undergraduate students participated in this study.
The  first  group  consisted  of  119  students  from  psychology  classes  in  the 
University  of  Minnesota's  General  College  (GC)  who  were  tested  in  the  winter 
of  1975.  The  second  group,  tested  in  the  spring  of  1975,  consisted  of  185 
students  from  an  introductory  psychology  class  in  the  University's  College  of 
Liberal Arts (CLA). All students were volunteers who received points toward
their  final  course  grades  for  participation  in  the  experiment.  GC  students 
typically  perform  more  poorly  on  ability  and  aptitude  tests  than  do  CLA 
students;  for  the  purposes  of  this  study,  the  GC  students  will  therefore  be 
designated  as  the  "low-ability"  group  while  the  CLA  students  will  be  referred 
to  as  the  "high-ability"  group. 

Test  administration.  All  students  were  tested  at  individual  cathode-ray 
terminals  (CRTs)  connected  to  a Hewlett-Packard  9600E  real-time  computer  system. 
Instructional  screens  similar  to  those  described  by  DeWitt  and  Weiss  (1974, 
pp.  36-53)  explained  the  operation  of  the  CRTs  before  the  actual  testing  was 
begun.  In  addition,  a proctor  was  present  in  the  testing  room  to  provide 
assistance  in  the  operation  of  the  equipment. 

Each  student  answered  41  multiple-choice  vocabulary  test  items.  The 
first  six  test  items  presented  were  identical  for  testees  in  a given  ability 
group.  These  items,  whose  difficulties  reflected  the  difficulty  range  of  the 
test,  served  to  familiarize  the  students  with  the  range  of  difficulties  they 
would  subsequently  encounter.  The  remaining  35  items  in  each  test  were  presented 
in  four  different  orders  of  administration  to  minimize  the  effect  that  the  order 
of  item  presentation  might  have  on  perceived  item  difficulty.  Testees  were 
sequentially  assigned  to  one  of  the  four  conditions.  Although  the  same 
procedure  was  followed  in  both  ability  groups,  the  items  differed  between 
groups.  Appendix  Table  B-1  shows  the  order  of  item  administration  in  each  of 
the  four  conditions  for  each  ability  group. 
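The sequential assignment of testees to the four order conditions can be sketched as a simple rotation. This is only an illustration: the function name and the assumption that assignment cycled strictly in order of arrival are hypothetical, since the report says only that testees were "sequentially assigned."

```python
def assign_condition(arrival_index: int, n_conditions: int = 4) -> int:
    """Return the 1-based order condition for a testee.

    arrival_index is the testee's 0-based position in order of arrival
    (a hypothetical bookkeeping convention).
    """
    return arrival_index % n_conditions + 1


# The first ten testees cycle through conditions 1 to 4 in turn.
conditions = [assign_condition(k) for k in range(10)]
print(conditions)  # [1, 2, 3, 4, 1, 2, 3, 4, 1, 2]
```

A rotation of this kind keeps the four conditions nearly equal in size within each ability group.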

Prior  to  the  administration  of  the  test,  the  students  were  informed  that 
they  would  have  as  much  time  as  they  needed  to  complete  the  task.  During  the 
test,  items  were  presented  on  the  CRT  screen  and  students  responded  by  typing 
the  number  corresponding  to  the  chosen  alternative  for  each  five-alternative 
multiple-choice  item.  Immediately  after  responding  to  an  item,  each  student 
was asked to indicate the item's perceived difficulty by entering a difficulty
code  selected  from  the  following  list: 


A.  Much too easy for you
B.  Somewhat too easy for you
C.  Just about right for you
D.  Somewhat too hard for you
E.  Much too hard for you.

The  testee's  response  was  then  checked  by  the  computer  to  ensure  that  one  of 
the  five  alternatives  had  been  chosen,  and  these  data  were  stored  with  the 
item-response  data  for  later  analysis. 

Design 

The  study  was  designed  to  investigate  three  different  aspects  of  item- 
difficulty  perception.  The  initial  phase  was  designed  to  determine  whether  or 
not  testees  could  accurately  perceive  the  difficulty  of  ability-test  items. 

The  second  phase  was  concerned  with  whether  or  not  a testee's  ability  level 
was  related  to  the  perception  of  the  relative  difficulty  of  a given  item; 
that  is,  how  accurate  an  individual's  perceptions  were,  relative  to  his/her 
ability  level.  The  third  phase  of  the  analysis  attempted  to  determine  the 
relative  item  difficulty  which  was  perceived  by  the  testee  as  being  about 
right  for  his/her  ability  level. 


Accuracy  of  Difficulty  Perceptions 


Method  of  Analysis 


Difficulty perception model.¹ An individual's perception of an item's
difficulty can be thought of as the signed distance between the person's
ability level and the item's difficulty level in a Euclidean ability/difficulty
space. This perception will be denoted by

$$d_{ij} = \sum_{p=1}^{P} w_{jp}\,(x_{jp} - x_{ip}) \qquad [1]$$

where $d_{ij}$ is the perceived difficulty of item $j$ for person $i$,
      $x_{jp}$ is the difficulty of item $j$ along ability/difficulty dimension $p$,
      $x_{ip}$ is the ability of person $i$ along ability/difficulty dimension $p$,
      $w_{jp}$ is the weight of item $j$ along dimension $p$, and
      $P$ is the number of dimensions in the ability/difficulty space.

Thus, in this model, the difficulty of an item for a given person is defined
as the weighted sum of the signed distances between the location of the item
and the location of the person along $P$ ability/difficulty dimensions. For the
present analysis, numerical values of $d_{ij}$ were assigned to each alternative
on the rating scale. The values assigned to alternatives A through E were -2,
-1, 0, +1, and +2, respectively. Thus, $d_{ij}$ increased as the perceived
difficulty of an item increased, and $d_{ij}$ was equal to zero when an item was
perceived by a testee as "just about right for [me]."

¹Appreciation for the development of this model is expressed to Mark Davison,
Assistant Professor of Educational Psychology, University of Minnesota.

The  use  of  a model  such  as  that  in  Equation  1 is  advantageous  for  several 
reasons.  Using  the  difficulty  ratings  alone,  estimates  of  individual  ability 
levels  and  item  difficulties  can  be  derived  on  a common  metric.  In  addition, 
the  general,  multidimensional  form  of  the  model  may  be  particularly  useful  in 
describing  difficulty  perceptions  on  multi-ability  test  batteries  or  other 
such  multi-trait  instruments. 
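The general form of the model can be sketched in a few lines. The sketch below computes the perceived difficulty of one item for one person as the weighted sum of signed item-minus-person distances over dimensions; the function name and all numerical values are hypothetical and serve only to show how the multidimensional form reduces to a simple difference when there is one dimension with unit weight.

```python
def perceived_difficulty(item_difficulty, person_ability, weights=None):
    """Weighted sum over dimensions of (item difficulty - person ability).

    Each argument is a sequence with one entry per ability/difficulty dimension.
    If no weights are given, unit weights are assumed.
    """
    if weights is None:
        weights = [1.0] * len(item_difficulty)
    return sum(
        w * (x_item - x_person)
        for w, x_item, x_person in zip(weights, item_difficulty, person_ability)
    )


# One dimension, unit weight: the item lies 1.5 units above the testee's ability,
# so it should be perceived as too hard (positive value).
d_hard = perceived_difficulty([0.5], [-1.0])

# Two dimensions with unequal weights (all values hypothetical).
d_two = perceived_difficulty([0.5, -0.2], [0.0, 0.3], weights=[1.0, 2.0])

print(d_hard, d_two)
```

A positive value corresponds to an item located above the person on the weighted dimensions, which matches the direction of the -2 to +2 rating codes.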


Note that $P$ in the model corresponds to the number of dimensions in the
space. If the item difficulty ratings are unidimensional, $P$ will equal 1 and
$d_{ij}$ can be expressed more simply as

$$d_{ij} = w_j\,(x_j - x_i). \qquad [2]$$

Further, if the items are assigned unit weights, the expression in Equation 2
becomes

$$d_{ij} = (x_j - x_i). \qquad [3]$$


If the model and the assumption of unidimensionality are appropriate and
the average ability level within a group of testees is arbitrarily set at zero,
a least squares estimate of a single item's difficulty ($x_j$) is found to be

$$\hat{x}_j = \frac{1}{N}\sum_{i=1}^{N} d_{ij} \qquad [4]$$

where $N$ is the number of persons rating the item. Thus, an estimate of an
item's difficulty is simply the average difficulty rating assigned to that
item by the individuals being tested.


Similarly, a least squares estimate of $x_i$, the ability level of person $i$, is

$$\hat{x}_i = -\frac{1}{n}\sum_{j=1}^{n} d_{ij} + \frac{1}{n}\sum_{j=1}^{n} x_j \qquad [5]$$

where $n$ is the number of items administered. An estimate of an individual's
ability level is thus the average item difficulty in a set of items minus the
average difficulty rating he/she assigns to the items in that set.
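The two least-squares estimates can be illustrated with a small worked example. The sketch below converts rating codes A through E to the values -2 through +2, takes column means to estimate item difficulties (Equation 4), and subtracts each person's mean rating from the mean item difficulty to estimate abilities (Equation 5). The function names and the 3 x 3 rating matrix are hypothetical and are not data from the study.

```python
# Numerical values assigned to the five rating alternatives.
RATING_VALUES = {"A": -2, "B": -1, "C": 0, "D": 1, "E": 2}


def item_difficulty_estimates(ratings):
    """Equation 4: mean rating per item (rows are persons, columns are items)."""
    n_persons = len(ratings)
    n_items = len(ratings[0])
    return [sum(row[j] for row in ratings) / n_persons for j in range(n_items)]


def ability_estimates(ratings, item_difficulties):
    """Equation 5: mean item difficulty minus each person's mean rating."""
    n_items = len(item_difficulties)
    mean_difficulty = sum(item_difficulties) / n_items
    return [mean_difficulty - sum(row) / n_items for row in ratings]


# Hypothetical rating codes: 3 testees (rows) by 3 items (columns).
coded = ["ABC", "BCD", "CDE"]
ratings = [[RATING_VALUES[code] for code in row] for row in coded]

x_items = item_difficulty_estimates(ratings)
x_persons = ability_estimates(ratings, x_items)
print(x_items)    # [-1.0, 0.0, 1.0]
print(x_persons)  # [1.0, 0.0, -1.0]
```

In this example the testee who rates every item as easiest receives the highest ability estimate, and the ability estimates average zero, consistent with the constraint used to derive Equation 4.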


Accuracy of ratings-based estimates. The estimates of item difficulties
and individual ability levels described by Equations 4 and 5 are based solely
on the testees' ratings of relative item difficulties. In order to determine
the appropriateness or accuracy of these perceptions, the ratings-based
estimates of item difficulties and students' abilities were compared to more
conventional estimates based on the correctness/incorrectness of the testees'
conventional responses to the test items.


The ratings-based estimates of item difficulty were correlated with the
proportion of persons in the present study identifying the correct response
alternative and also with the normal-ogive estimates of item difficulty ($b_j$)
based on the item-calibration described in Appendix A. The ratings-based
estimates of student ability were correlated with traditional number-correct
scores and maximum-likelihood ability estimates (Betz & Weiss, 1976a)
based on the normal-ogive parameters of the items.
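The comparison just described amounts to computing product-moment correlations between pairs of difficulty indices. The following sketch, using made-up values for five items, shows why a ratings-based difficulty estimate would be expected to correlate negatively with proportion correct (easier items have higher proportions correct) and positively with the normal-ogive difficulty parameter. The helper function and all numbers are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical indices for five items, ordered from easiest to hardest.
ratings_based = [-0.9, -0.4, 0.1, 0.6, 1.1]          # mean difficulty ratings
proportion_correct = [0.95, 0.80, 0.70, 0.45, 0.20]  # higher means easier
normal_ogive_b = [-3.0, -1.5, -0.5, 0.5, 1.5]        # higher means harder

r_with_p = pearson_r(ratings_based, proportion_correct)
r_with_b = pearson_r(ratings_based, normal_ogive_b)
print(round(r_with_p, 2), round(r_with_b, 2))
```

The opposite signs reflect only the direction in which each index is scaled, not a disagreement between the indices.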




Dimensionality of difficulty perceptions. In order to use the simple,
unidimensional form of the difficulty-perception model described above, the
unidimensionality of the difficulty ratings must be demonstrated. Because there
is no definitive test of unidimensionality, an indirect evaluation was necessary.
McBride and Weiss (1974) suggested four criteria which, if met, constitute
sufficient evidence of unidimensionality in item-response data. According to
the criteria suggested, confirmatory evidence of unidimensionality is present
when: 1) the first common factor of the matrix of inter-item correlations is a
general factor accounting for a large proportion of the common variance and on
which all variables load highly; 2) the second and subsequent factors account
for much smaller and essentially equal proportions of the common variance;
3) the item loadings on the first factor are either all positive or all
negative; and 4) none of the above criteria are satisfied by the analysis of a
similar correlation matrix constructed from computer-generated random data.
Although these criteria were suggested in the context of the analysis of
item-response data, they are equally applicable to the analysis of the
difficulty ratings.

Accordingly, a 41x41 matrix of product-moment inter-item correlations
among the difficulty ratings was factor analyzed for each ability group.
Communalities for each item were estimated by the squared multiple correlation
of that item with all others in the matrix. Factors were extracted by the
principal axes procedure and the resulting communalities were substituted
for the prior communality estimates. This procedure continued in an iterative
fashion until the differences between the two communality estimates were
negligible.
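The iterative procedure described above can be sketched with NumPy as follows. This is a minimal illustration under stated assumptions, not the program used in the study: the function name, the convergence tolerance, the iteration cap, and the small four-item correlation matrix are all hypothetical. Initial communalities are squared multiple correlations, factors are taken from the eigendecomposition of the reduced correlation matrix, and the communalities are re-estimated until they stop changing.

```python
import numpy as np


def principal_axis(R, n_factors=1, tol=1e-6, max_iter=500):
    """Iterated principal-axis factoring of a correlation matrix R."""
    R = np.asarray(R, dtype=float)
    # Initial communalities: squared multiple correlation of each item
    # with all other items in the matrix.
    h2 = 1.0 - 1.0 / np.diag(np.linalg.inv(R))
    for _ in range(max_iter):
        reduced = R.copy()
        np.fill_diagonal(reduced, h2)  # communalities replace the unit diagonal
        eigvals, eigvecs = np.linalg.eigh(reduced)
        top = np.argsort(eigvals)[::-1][:n_factors]
        loadings = eigvecs[:, top] * np.sqrt(np.clip(eigvals[top], 0.0, None))
        new_h2 = np.sum(loadings ** 2, axis=1)
        converged = np.max(np.abs(new_h2 - h2)) < tol
        h2 = new_h2
        if converged:
            break
    return loadings, h2


# Hypothetical one-factor correlation matrix built from known loadings.
true_loadings = np.array([0.8, 0.7, 0.6, 0.5])
R = np.outer(true_loadings, true_loadings)
np.fill_diagonal(R, 1.0)

loadings, communalities = principal_axis(R)
print(np.round(np.abs(loadings[:, 0]), 2))
```

Because the sign of an eigenvector is arbitrary, the loadings are compared in absolute value; what matters for the unidimensionality criteria is that they all share one sign.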

Results 


Dimensionality  of  difficulty  perceptions.  Evidence  of  the  dimensionality 
of the difficulty ratings is shown in Figures 1a and 1b. These figures show
the  first  ten  eigenvalues  of  the  inter-item  correlation  matrix  based  on  the 
difficulty  ratings  for  the  low-  and  high-ability  groups,  respectively.  In  both 
figures,  the  eigenvalues  from  the  analysis  of  the  ratings  are  represented  by  a 
solid  line,  while  the  dashed  line  shows  those  resulting  from  an  analysis  of 
comparable,  computer-generated  random  data. 

In  both  ability  groups,  the  first  factor  of  the  real  data  extracted  by 
far  the  largest  amount  of  variance,  while  the  second  factor  extracted  only 
slightly  more  variance  than  did  subsequent  factors.  The  first  factors  extracted 
from  the  random  data,  on  the  other  hand,  accounted  for  little  more  variance 
than  other  random-data  factors.  The  amount  of  variance  extracted  by  the  second 
and  subsequent  factors  in  the  real  data  was  similar  to  that  extracted  by  the 
second  and  subsequent  factors  in  the  random  data. 
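The logic of comparing eigenvalues from real ratings with those from comparable random data can be illustrated with simulated values. In the sketch below, one set of hypothetical ratings is driven by a single common dimension and another is purely random; the sample sizes, noise level, and random seed are arbitrary assumptions. For simplicity the eigenvalues are taken from the correlation matrix with unit diagonals rather than from a communality-reduced matrix.

```python
import numpy as np

rng = np.random.default_rng(0)
n_persons, n_items = 200, 10

# Hypothetical ratings driven by one common dimension plus noise,
# rounded and clipped to the -2..+2 rating scale.
common = rng.normal(size=(n_persons, 1))
noise = 0.7 * rng.normal(size=(n_persons, n_items))
structured = np.clip(np.rint(common + noise), -2, 2)

# Comparable random ratings with no common dimension.
random_ratings = rng.integers(-2, 3, size=(n_persons, n_items))


def sorted_eigenvalues(data):
    """Eigenvalues of the inter-item correlation matrix, largest first."""
    corr = np.corrcoef(data, rowvar=False)
    return np.sort(np.linalg.eigvalsh(corr))[::-1]


eig_structured = sorted_eigenvalues(structured)
eig_random = sorted_eigenvalues(random_ratings)
print(np.round(eig_structured[:3], 2))
print(np.round(eig_random[:3], 2))
```

With data of this kind, the first eigenvalue of the structured ratings stands far above the rest, while the random-data eigenvalues decline gradually, which is the pattern the criteria look for.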




Table  1 lists  the  loadings  of  the  items  from  each  test  on  the  first  three 
factors  extracted  from  the  matrix  of  inter-item  correlations  of  difficulty 
ratings  for  that  test.  Each  of  the  items  loaded  positively  on  the  first 
factor  from  that  test's  data,  and  the  first  factor  loadings  were  generally  high. 
These  data  therefore  suggest  the  existence  of  a "general"  factor.  Also  shown 
in  Table  1 are  the  loadings  for  the  first  three  factors  from  the  comparable 
random  data  for  each  group.  For  these  latter  data,  the  first  factor  was 
bipolar  for  both  groups;  i.e.,  positive  and  negative  loadings  occurred  as 
frequently  on  the  first  factor  as  on  Factors  2 and  3.  In  the  real  data,  such 
bipolarity  occurred  only  on  the  second  and  subsequent  factors.  These  results 
therefore  suggest  that  for  both  ability  groups,  the  difficulty  ratings  may 
be  characterized  as  being  unidimensional. 


Accuracy of ratings-based estimates. Because the difficulty perceptions
appeared to be unidimensional, the difficulty ratings were used in conjunction
with Equations 4 and 5 to calculate ratings-based estimates of item difficulty
($\hat{x}_j$) and testee ability ($\hat{x}_i$). The estimates of item
difficulties, based solely on the difficulty ratings, are shown in Table 2.
Table 2 also shows proportion correct ($p_j$) and normal-ogive ($b_j$)
item-difficulty estimates for each item.


In the low-ability group, estimates of item difficulty derived from the
difficulty perceptions were highly related to proportion-correct and
normal-ogive item-difficulty estimates; Pearson product-moment correlations
were r = -.86 and r = .80, respectively. The relationships between the
ratings-based difficulty estimates and the estimates based on conventional
responses to the items were similarly high for items in the high-ability group,
with respective Pearson product-moment correlations of r = -.94 and r = .85.

Appendix Table B-2 shows, for each testee, number-correct scores ($n_i$)
and maximum likelihood estimates of the testee's ability level
($\hat{\theta}_i$) based on his/her conventional responses to the items and the
corresponding ability estimates based on the difficulty perceptions
($\hat{x}_i$). The Pearson product-moment correlations of the ratings-based
ability estimates with the corresponding number-correct scores and with maximum
likelihood ability estimates were r = .55 and r = .56, respectively, for
testees in the low-ability group. For persons in the high-ability group,
comparable correlations were r = .63 and r = .59, respectively.


Difficulty  Perceptions  of  Individual  Items 

The  second  phase  of  the  analysis  assessed  the  relationship  between  the 
ability  levels  of  testees  and  the  perceived  difficulty  of  a given  item.  As  an 
individual's ability level increases relative to the difficulty level of an
item,  the  item  should  be  perceived  by  the  individual  as  being  relatively  less 
difficult.  As  student  ability  levels  decrease  in  comparison  to  an  item's 
difficulty,  the  item  should  appear  to  the  testees  as  being  relatively  more 
difficult.  Thus,  the  difficulty  rating  assigned  by  a testee  to  an  individual 
item  should  be  dependent  upon  the  discrepancy  between  the  testee's  ability 
level  and  the  item's  difficulty. 


[Table 1: loadings of the items from each test on the first three factors
extracted from the difficulty ratings and from the comparable random data,
for each ability group; values not recoverable]


Table 2
Least-Squares Item Difficulty Estimates Based on the
Difficulty Perceptions ($\hat{x}_j$) and Corresponding Proportion-Correct ($p_j$)
and Normal-Ogive ($b_j$) Item Difficulty Indices

      Low-Ability Group                    High-Ability Group
Item Reference                       Item Reference
    Number      x_j    p_j    b_j        Number      x_j    p_j    b_j

       2       -.58    .99  -3.81           2       -.89    .97  -3.81
       4       -.79    .99  -5.56           7      -1.11    .98  -2.32
       7       -.96    .97  -2.32          14      -1.10    .96  -2.46
      14       -.80    .97  -2.46          18       -.60    .94  -4.24
      18       -.59    .94  -4.24          19       -.95    .91  -3.81
      19       -.83    .91  -3.81          23       -.97    .99  -3.86
      20      -1.51    .96  -5.76          24      -1.15    .99  -2.37
      23       -.58    .89  -3.86          39       -.45    .90  -3.63
      24       -.83    .99  -2.37          44       -.32    .88  -1.41
      29       -.59    .96  -5.52          51       -.09    .75  -1.04
      41       -.71    .89  -6.45          56        .39    .47    .13
      44        .06    .76  -1.41          64      -1.29    .99  -2.36
      51        .18    .57  -1.04          68       -.26    .98  -2.48
      55       -.65    .94  -4.95          77       -.75    .94  -3.60
      56        .57    .32    .13          86       -.24    .82  -1.19
      62       -.94    .99  -4.95          91       -.17    .66   -.20
      64       -.97    .97  -2.36         104        .72    .47    .05
      68       -.68    .92  -2.48         108       -.36    .75  -1.16
      72       -.72    .97  -6.13         111        .72    .34    .94
      77       -.29    .79  -3.60         114        .80    .28    .96
      78       -.55    .92  -4.84         115       1.23    .16   2.02
      86        .21    .59  -1.19         120       -.35    .37   1.46
      89        .01    .78  -2.49         137       1.10    .48   -.06
      91        .08    .56   -.20         145        .40    .48    .09
     108       -.20    .57  -1.16         147        .17    .30   1.47
     111        .74    .19    .94         154        .07    .59   -.12
     114        .83    .16    .96         162       1.09    .21   1.24
     141       -.05    .61  -1.21         167        .67    .41   2.16
     145        .16    .47    .09         174        .84    .30   1.45
     154        .22    .42   -.12         182      -1.01    .99  -3.83
     162       1.12    .11   1.24         188        .91    .47   -.04
     174        .84    .18   1.45         191       -.30    .89  -1.26
     182       -.71    .97  -3.83         217        .97    .28   1.38
     188       1.09    .31   -.04         253       1.06    .29   1.44
     191       -.16    .76  -1.26         302        .90    .51    .85
     192        .36    .89  -6.52         319       1.09    .21   2.14
     198       -.51    .94  -2.50         337        .61    .42   1.18
     302        .93    .58    .85         359        .59    .16   2.07
     337        .76    .41   1.18         375       1.36    .31    .93
     375       1.34    .22    .93         383        .94    .34   1.52
     651        .84    .31    .89         514        .63    .43   1.74

-11- 


Table 3

Correlations of Difficulty Ratings with Ability-Level/Item-Difficulty Discrepancy (r)
and Dichotomized Item Scores (r_bis)

(Item = item reference number)

   Low-Ability Group           High-Ability Group
 Item        r    r_bis       Item        r    r_bis
    2     -.39     -.19          2     -.31    -1.00
    4     -.27     -.67          7     -.44     -.58
    7     -.31     -.30         14     -.33     -.36
   14     -.27     -.28         18     -.21     -.60
   18     -.34     -.24         19     -.28     -.88
   19     -.26     -.78         23     -.38     -.67
   20     -.28     -.57         24     -.22     -.07
   23     -.37     -.58         39     -.30     -.73
   24     -.27     -.30         44     -.25     -.34
   29     -.40    -1.00         51     -.39     -.55
   41     -.34     -.10         56     -.38     -.75
   44     -.49     -.51         64     -.27      .07
   51     -.49     -.69         68     -.21      .20
   55     -.30     -.30         77     -.36     -.56
   56     -.40     -.67         86     -.49     -.66
   62     -.26     -.75         91     -.44     -.63
   64     -.25     -.15        104     -.44     -.69
   68     -.17      .20        108     -.41     -.49
   72     -.24     -.73        111     -.38     -.47
   77     -.39     -.47        114     -.42     -.41
   78     -.56     -.05        115     -.29     -.56
   86     -.56     -.66        120     -.31     -.33
   89     -.49     -.85        137     -.28     -.61
   91     -.34     -.23        145     -.41     -.48
  108     -.43     -.40        147     -.13     -.22
  111     -.43     -.32        154     -.33     -.38
  114     -.43     -.47        162     -.49     -.72
  141     -.41     -.48        167     -.23     -.33
  145     -.37     -.16        174     -.18     -.18
  154     -.51     -.62        182     -.30      .22
  162     -.21     -.23        188     -.48     -.65
  174     -.23     -.22        191     -.41     -.60
  182     -.27     -.30        217     -.23     -.39
  188     -.28     -.50        253      .11     -.20
  191     -.44     -.52        302     -.31     -.43
  192     -.40     -.25        319     -.41     -.61
  198     -.35     -.76        337     -.39     -.49
  302     -.03     -.37        359     -.01      .14
  337     -.19     -.44        375     -.15     -.33
  375     -.10     -.06        383     -.36     -.40
  651     -.18     -.30        514     -.50     -.45


Method of Analysis

The normal-ogive testing model permits the estimation of individual ability
levels and item difficulty levels on a common metric. Thus, an estimate of the
discrepancy between an individual's ability level and an item's difficulty is
$\theta_i - b_j$, where $\theta_i$ represents the ability level of person $i$,
and $b_j$ represents the difficulty of item $j$.

To assess the relationship between the ability-level/item-difficulty
discrepancy ($\theta_i - b_j$) and the testee's difficulty perception for a
single item ($d_{ij}$), the Pearson product-moment correlation ($r$) between
$\theta_i - b_j$ and $d_{ij}$ was computed for each item. Because the estimate
of $\theta_i$ and the estimate of $b_j$ are fallible and because it is possible
that testees' perceptions are more directly related to whether or not they can
answer the item correctly than to $\theta_i - b_j$, the biserial correlation
($r_{bis}$) between the testees' item scores (0 if incorrect, 1 if correct) and
their difficulty perceptions was also computed.

Results


Table 3 shows the correlations of the $\theta_i - b_j$ discrepancy and the
difficulty ratings, $d_{ij}$, for items on both tests. The median correlations
were -.34 for the low-ability group and -.33 for the high-ability group.
Correlations ranged from -.56 to -.03 for the low-ability group and from -.50
to -.11 for the high-ability group.


Table  3 also  shows  the  biserial  correlations  of  the  item  scores  and  the 
difficulty  ratings  for  each  test  item.  The  median  biserial  correlations  were 
-.40  and  -.48  for  the  low-  and  high-ability  groups,  respectively.  These 
correlations ranged from -1.00 to .20 for the low-ability group and from -1.00
to  .22  for  the  high-ability  group. 
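
The two coefficients reported in Table 3 can be illustrated with a minimal sketch. It assumes a numeric coding of the difficulty ratings in which higher values mean "more difficult" (the coding, the function names, and the eight data points are made up for illustration and are not taken from the report), and it uses the textbook biserial formula, which may differ in detail from the program actually used.

```python
from math import sqrt
from statistics import NormalDist, mean, pstdev


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    ssx = sum((a - mx) ** 2 for a in x)
    ssy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(ssx * ssy)


def biserial_r(scores, ratings):
    """Biserial correlation between 0/1 item scores and difficulty ratings."""
    right = [r for s, r in zip(scores, ratings) if s == 1]
    wrong = [r for s, r in zip(scores, ratings) if s == 0]
    p = len(right) / len(scores)           # proportion answering correctly
    z = NormalDist().inv_cdf(1.0 - p)      # z-score above which proportion p lies
    ordinate = NormalDist().pdf(z)         # normal density at that z-score
    point_biserial = (mean(right) - mean(wrong)) / pstdev(ratings) * sqrt(p * (1 - p))
    return point_biserial * sqrt(p * (1 - p)) / ordinate


# Made-up data for one item and eight testees (not from the report).
discrepancy = [1.2, 0.8, -0.5, 0.1, -1.0, 2.0, -0.3, 0.6]   # theta_i - b_j
ratings = [2, 2, 4, 3, 5, 1, 4, 3]     # hypothetical coding: higher = harder
scores = [1, 1, 0, 1, 0, 1, 0, 1]      # 0 if incorrect, 1 if correct

r = pearson_r(discrepancy, ratings)
r_bis = biserial_r(scores, ratings)
print(f"r = {r:.2f}, r_bis = {r_bis:.2f}")
```

Both coefficients come out negative for these data, the same direction as most entries in Table 3. Unlike the Pearson coefficient, a computed biserial coefficient is not guaranteed to stay within -1.0 and +1.0.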


Perceptions  of  Appropriate  Item  Difficulty 


Adaptive testing procedures generally tailor a test such that item difficulty
parameters are somewhat near the estimated ability level for a given testee,
i.e., so that $\theta_i - b_j$ approaches zero. Although these items may be
"about right" in difficulty from a psychometric standpoint, they may not be
"about right" from the individual testee's point of view. The third phase of
the analysis was designed to determine the testee-ability/item-difficulty
discrepancy for an item which was perceived by the testee as being "about
right" for him/her.


Method  of  Analysis 


For each test item, an average $\theta_i - b_j$ was computed for those persons
giving the item rating of "C", indicating that they perceived the difficulty of
the item as "just about right" for them.
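
The per-item averaging described above can be sketched as follows. The data layout (dictionaries keyed by testee), the function name, and the rating letters other than "C" are hypothetical; only the rule "average $\theta_i - b_j$ over testees who rated the item C" comes from the text.

```python
from statistics import mean


def mean_about_right_discrepancy(theta, b_j, item_ratings, target="C"):
    """Average theta_i - b_j over the testees who gave item j the target rating.

    theta        -- dict mapping testee id to ability estimate (hypothetical layout)
    b_j          -- difficulty parameter of the item
    item_ratings -- dict mapping testee id to that testee's rating of the item
    Returns (mean discrepancy, number of testees), or (None, 0) if nobody
    chose the target rating.
    """
    diffs = [theta[i] - b_j for i, rating in item_ratings.items() if rating == target]
    if not diffs:
        return None, 0
    return mean(diffs), len(diffs)


# Made-up ability estimates and ratings of a single item with b_j = -1.0.
theta = {"s1": -0.4, "s2": 0.3, "s3": 1.1, "s4": -1.2}
ratings = {"s1": "C", "s2": "B", "s3": "C", "s4": "D"}
avg, n = mean_about_right_discrepancy(theta, b_j=-1.0, item_ratings=ratings)
print(round(avg, 2), n)
```

A positive result means the item was, on average, easier than the ability level of the testees who called it "about right".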


Results

Table 4 shows the average discrepancy of subjects assigning "C" to the item
for each of the items on the two tests. It is obvious from the data in Table 4
that the "about right" perceptions differ greatly from item to item.

Positive values of these mean discrepancies indicate that an item was
perceived as "about right" when the difficulty level of the item ($b_j$) was,
on the average, below the testees' estimated ability level. For the low-ability
group, 28 of the 41 items had positive mean discrepancies; these discrepancies
ranged from .34 to 5.77. For the high-ability group, 20 of the 41 items had
positive mean discrepancies, ranging from .14 to 4.04.

Negative values indicate a judgment of "about right" for items which are
above a testee's ability level. For the low-ability group, these ranged from
-.31 to -2.04. For the high-ability group, the range was -.06 to -2.44.

The average signed mean discrepancy was approximately 1.39 for the low-ability
testees and .29 for the high-ability testees. These averages are somewhat
ambiguous because differing numbers of testees contributed to the computation
of means for individual items. The overall mean discrepancies judged to be
"about right," weighted by the number of persons upon which each item mean was
based, were approximately 1.70 and .47 for the low- and high-ability groups,
respectively.
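
The difference between the unweighted and the weighted averages can be seen in a small sketch. The three (mean discrepancy, number of students) pairs are the Table 4 entries for low-ability items 2, 4, and 56, used only as an illustration; the function names are assumptions.

```python
def unweighted_mean(means):
    """Simple average of the item means, ignoring how many testees each is based on."""
    return sum(means) / len(means)


def weighted_mean(means, counts):
    """Average of the item means, each weighted by its number of testees."""
    return sum(m * n for m, n in zip(means, counts)) / sum(counts)


means = [2.87, 4.63, -0.75]   # mean discrepancies for three items
counts = [50, 48, 35]         # students rating each item "about right"

print(round(unweighted_mean(means), 3))
print(round(weighted_mean(means, counts), 3))
```

Because the item with the negative mean is based on fewer students, the weighted average is pulled less toward it than the unweighted average.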


Table 4

Mean Signed Discrepancy by Item Between Testee Ability and Item Difficulty
($\theta_i - b_j$) for Students Rating an Item "Just About Right for [me],"
for Two Ability Groups

(Item = item reference number; Mean = mean discrepancy; N = number of students)

  Low-Ability Group          High-Ability Group
 Item     Mean      N        Item     Mean      N
    2     2.87     50           2     3.38     60
    4     4.63     48           7     1.52     47
    7     1.24     36          14     1.68     51
   14     1.47     46          18     4.04     58
   18     3.37     53          19     3.29     39
   19     2.73     42          23     3.16     61
   20     4.03      8          24     1.85     43
   23     2.97     54          39     3.29     76
   24     1.44     46          44     1.15    101
   29     4.54     50          51      .79     90
   41     5.54     49          56     -.06     59
   44      .75     52          64     1.77     34
   51      .36     49          68     2.01     82
   55     3.94     60          77     2.96     76
   56     -.75     35          86      .77     60
   62     4.00     33          91     -.29     73
   64     1.37     39         104      .14     32
   68     1.46     53         108      .85     78
   72     5.13     42         111     -.88     48
   77     2.66     60         114     -.87     48
   78     3.88     62         115    -1.85     11
   86      .61     37         120    -1.92     88
   89     1.69     51         137      .42     31
   91     -.82     53         145     -.26     77
  108      .34     54         147    -1.80     84
  111    -1.49     32         154     -.15     95
  114    -1.25     32         162     -.75     26
  141      .50     55         167    -2.44     51
  145     -.73     43         174    -1.37     46
  154     -.59     63         182     3.16     55
  162    -1.45     14         188      .29     32
  174    -2.04     26         191      .94     73
  182     2.90     61         217    -1.31     38
  188     -.31     14         253    -1.99     27
  191      .46     49         302     -.96     40
  192     5.77     47         319    -1.59     29
  198     1.59     57         337    -1.29     63
  302    -1.22     20         359    -2.35     42
  337    -1.66     35         375     -.62     15
  375    -1.37     11         383    -1.20     49
  651    -1.59     29         514    -1.64     56


Discussion

Least squares estimates of item difficulties, based on the difficulty
ratings assigned to the items and the unidimensional difficulty-perception
model, were closely related to difficulty indices based on conventional
responses to the items. Thus, students were able to accurately perceive
the relative difficulties of a set of test items. There was some suggestion
in the data that high-ability testees perceived item difficulties relatively
more accurately than did low-ability testees.

Similarly, ratings-based ability estimates corresponded relatively well
with more traditional ability estimates. Because these ratings-based ability
estimates were essentially an average of the difficulty ratings assigned to the
items, the positive correlations between these estimates and, for instance,
the number-correct scores indicate that as ability levels increased, the items
were rated as being relatively less difficult, on the average.

The correlations between the ratings-based ability estimates and the
number-correct scores also indicate that testees can, with a fair degree of
accuracy, perceive how well they have performed on an ability test. The
correlation of .55 for the low-ability group suggested that students in this
group were slightly less able to perceive their ability levels as assessed by
number-correct scores than were testees in the high-ability group, where
number-correct scores and ratings-based ability estimates were more highly
correlated. In general, however, the magnitude of the relationships between
the difficulty ratings and objective estimates of item difficulty and between
the ratings and estimates of testees' abilities indicates that testee
perceptions of test difficulty and their test performance are, at least
generally, accurate.


The second phase of the analysis showed that for an individual item,
however, there was relatively little relationship between testee perceptions
of item difficulty and testee-ability/item-difficulty discrepancies or the item
scores. The median proportions of variance accounted for by the linear
relationship between the $\theta_i - b_j$ discrepancy and the difficulty
perceptions ($r^2$) were only .12 and .11 for the two ability groups. The
median proportions of variance accounted for by the relationship between the
dichotomized item scores and the difficulty perceptions ($r_{bis}^2$) were .16
and .23 for the two groups. In these latter data, however, there again seems
to be a difference in favor of the high-ability group in that their difficulty
perceptions were more highly related to their test behavior.
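
As a check on the arithmetic, the proportions of variance quoted above are consistent with squaring the median correlations reported in the Results (-.34 and -.33 for $r$; -.40 and -.48 for $r_{bis}$). This is a worked illustration under that assumption, not a statement of how the authors computed the medians:

```latex
\begin{align*}
r^2:       &\quad (-.34)^2 = .1156 \approx .12, &\quad (-.33)^2 &= .1089 \approx .11 \\
r_{bis}^2: &\quad (-.40)^2 = .1600 = .16,       &\quad (-.48)^2 &= .2304 \approx .23
\end{align*}
```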


The  finding  most  relevant  for  the  design  of  ability-testing  procedures  was 
that  items  which  were  judged  by  the  testees  to  be  "about  right"  in  difficulty 
were  not  necessarily  "about  right"  from  a psychometric  point  of  view.  These  data, 
in  fact,  show  that  testees  perceived  items  that  were  somewhat  below  their  ability 
levels  as  being,  on  the  average,  about  right  for  persons  of  their  ability  level. 

In  the  case  of  the  low-ability  students,  the  items  perceived  as  appropriate  had, 
on  the  average,  normal-ogive  difficulty  parameters  which  were  over  1.5  standard 
deviations  below  the  testees'  maximum  likelihood  ability  estimates.  The  high- 
ability  students  judged  items  as  "about  right"  if,  on  the  average,  they  were 
about  one-half  standard  deviation  below  their  ability  levels.  Low-ability 
students  tended  to  judge  items  as  "about  right"  in  difficulty  when  the  items 
were  below  their  ability  levels;  the  high-ability  students  divided  their  "about 
right"  judgements  equally  between  items  which  were  psychometrically  too  easy  and 
those  which  were  psychometrically  too  difficult. 

Conclusions 


These  data  show  that  students'  perceptions  of  the  relative  difficulties  of 
a set  of  ability  test  items  are  quite  accurate,  but  that  their  perceptions  of 
the  difficulties  of  individual  ability-test  items  are  only  moderately  accurate. 
The data also suggest that the ability level of the testee has some effect on
difficulty  perceptions.  Ability  level  also  is  related  to  the  accuracy  of 
perception  of  a testee's  own  test  score.  Thus,  testees  of  different  ability 
levels  seem  to  encounter  a different  psychological  environment  when  interacting 
with  an  ability  test.  This  conclusion  is  further  supported  by  the  students' 
perceptions  of  the  items  which  are  "about  right"  for  their  ability  levels. 

The  psychometric  and  the  psychological  effects  of  adapting  an  ability  test 
to  a level  where  the  testee  perceives  the  test  difficulty  as  "about  right" 
should  be  studied.  Adaptive  testing  strategies  usually  tailor  a test  such  that 
the  estimated  difficulty  of  each  item  administered  is  close  to  the  current 
estimate of an individual's ability level. In adapting a test to ensure that
item  difficulties  are  psychometrically  optimal,  these  strategies  may  also,  in 
effect,  be  tailoring  the  test  so  that  all  of  the  items  are  perceived  by  testees 
as  being  too  difficult  for  persons  of  their  ability  level.  The  psychological 
effects  of  such  a procedure  should  be  investigated  more  fully. 




REFERENCES


Betz, N. E., & Weiss, D. J. Effects of immediate knowledge of results and
adaptive testing on ability test performance (Research Rep. 76-3).
Minneapolis: University of Minnesota, Department of Psychology,
Psychometric Methods Program, June 1976. (AD A027147) (a)

Betz, N. E., & Weiss, D. J. Psychological effects of immediate knowledge of
results and adaptive ability testing (Research Rep. 76-4). Minneapolis:
University of Minnesota, Department of Psychology, Psychometric Methods
Program, June 1976. (AD A027170) (b)

Bratfisch, O., Borg, G., & Dornic, S. Perceived item-difficulty in three tests
of intellectual performance capacity. Stockholm: University of Stockholm,
Institute of Applied Psychology, 1972 (29).

Bratfisch, O., Dornic, S., & Borg, G. Perceived difficulty of items in a
test of reasoning ability. Stockholm: University of Stockholm, Institute
of Applied Psychology, 1972 (28).

DeWitt, L. J., & Weiss, D. J. A computer software system for adaptive ability
measurement (Research Rep. 74-1). Minneapolis: University of Minnesota,
Department of Psychology, Psychometric Methods Program, January 1974.
(AD 773961)

Holtzman, W. H. Individually tailored testing: Discussion. In W. H. Holtzman
(Ed.), Computer-assisted instruction, testing and guidance. New York:
Harper & Row, 1970.

Jensema, C. A simple technique for estimating latent trait mental test parameters.
Educational and Psychological Measurement, 1976, 36(3), 705-716.

Lord, F. M. Some test theory for tailored testing. In W. H. Holtzman (Ed.),
Computer-assisted instruction, testing and guidance. New York: Harper &
Row, 1970.

Lord, F. M., & Novick, M. R. Statistical theories of mental test scores.
Reading, MA: Addison-Wesley, 1968.

McBride, J. R., & Weiss, D. J. A word knowledge item pool for adaptive ability
measurement (Research Rep. 74-2). Minneapolis: University of Minnesota,
Department of Psychology, Psychometric Methods Program, June 1974.
(AD 781894)

Munz, D. C., & Jacobs, P. D. An evaluation of perceived item-difficulty
sequencing in academic testing. British Journal of Educational Psychology,
1971, 195-205.

Urry, V. W. Ancillary estimators for the item parameters of mental test models.
Paper presented at the Annual Meeting of the American Psychological
Association, Washington, DC, August 1975.

Weiss, D. J. Strategies of adaptive ability measurement (Research Rep. 74-5).
Minneapolis: University of Minnesota, Department of Psychology, Psychometric
Methods Program, December 1974. (AD A004270)

Weiss, D. J., & Betz, N. E. Ability measurement: Conventional or adaptive?
(Research Rep. 73-1). Minneapolis: University of Minnesota, Department
of Psychology, Psychometric Methods Program, February 1973. (AD 757788)




APPENDIX  A 


Item  Calibration  Procedures 
Initial  Item  Parameter  Estimates 

The item parameterization procedures that were used assumed a normal-ogive
latent trait model and the existence of a bivariate-normal joint distribution
of $\theta$ (levels of the latent ability) and $x$ (the continuous variable
assumed to underlie the dichotomous item responses). Given these assumptions,
discrimination ($a$) and difficulty ($b$) parameters may be defined by
Equations 6 and 7,

$$a_j = \frac{\rho_{\theta x_j}}{\sqrt{1 - \rho_{\theta x_j}^2}} \qquad [6]$$

$$b_j = \frac{\gamma_j}{\rho_{\theta x_j}} \qquad [7]$$

where $\rho_{\theta x_j}$ is the correlation between individuals' ability levels
($\theta$) and their scores ($x$) on item $j$, and
$\gamma_j$ is the z-score above which lies the proportion of testees in the
population knowing the correct answer to item $j$ (Lord & Novick, 1968).

In order to estimate $\rho_{\theta x_j}$, the biserial correlation ($r_j$)
between testees' ability levels and their dichotomized item scores was found
by first estimating the point-biserial correlation ($r_{pb_j}$) between ability
levels and dichotomous item scores by Equation 8, based on data reported by
McBride and Weiss (1974),

$$r_{pb_j} = \frac{(\bar{x}_{+} - \bar{x}_{-})\sqrt{p_j(1 - p_j)}}{s} \qquad [8]$$

where $\bar{x}_{+}$ is the mean number-correct score of persons correctly answering item $j$,
$\bar{x}_{-}$ is the mean number-correct score of persons incorrectly answering item $j$,
$p_j$ is the proportion of persons correctly answering item $j$, and
$s$ is the standard deviation of number-correct scores for the total
group answering item $j$.

The biserial coefficient was then computed using the transformation in Equation 9,

$$r_j = \frac{r_{pb_j}\sqrt{p_j(1 - p_j)}}{\phi(z_j)} \qquad [9]$$

where $z_j$ is the z-score above which lies the proportion of testees in the
norming sample correctly answering item $j$ ($p_j$), and
$\phi(z_j)$ is the value of the normal probability density function at $z_j$.

Because a testee could answer an item correctly simply by random guessing
on these 5-alternative, multiple-choice items, a guessing parameter ($c$) was
defined for each item by Equation 10,

$$c_j = 1/n_j \qquad [10]$$

where $n_j$ is the number of response alternatives on item $j$.


In order to account for guessing when the initial $a$ and $b$ parameters used
to construct the tests described in this report were derived, the estimate of
$\rho_{\theta x_j}$ ($r_j$) computed in Equation 9 was modified according to
Equation 11,

$$r'_j = r_j / (1 - c_j) \qquad [11]$$

The estimate of $\rho_{\theta x_j}$ resulting from Equation 11 ($r'_j$) was
restricted to the interval from -1.0 to +1.0 and used, along with $z_j$ (as an
estimate of $\gamma_j$), to calculate values of $a$ and $b$ for each item using
Equations 6 and 7. The resulting values of $a_j$ were then restricted to the
interval from -3.0 to +3.0. The restrictions on $r'_j$ and $a_j$ thus affected
both the values of the $a$ and $b$ parameters but the effects of the
restrictions were not necessarily consistent.
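
The initial calibration steps (Equations 8 through 11, then Equations 6 and 7 with the two restrictions) can be sketched as below. This is a minimal illustration that assumes the standard forms of Equations 6, 7, and 9; the function name, argument names, and example values are hypothetical, and the handling of a zero or unit correlation is an assumption added so that the sketch runs.

```python
from math import copysign, sqrt
from statistics import NormalDist


def initial_item_parameters(mean_right, mean_wrong, sd_total, p, n_alternatives=5):
    """Initial normal-ogive a and b for one item, following Equations 6-11.

    mean_right / mean_wrong -- mean number-correct scores of testees answering
                               the item correctly / incorrectly
    sd_total                -- standard deviation of number-correct scores
    p                       -- proportion answering correctly (0 < p < 1)
    """
    q = 1.0 - p
    r_pb = (mean_right - mean_wrong) * sqrt(p * q) / sd_total   # Equation 8
    z = NormalDist().inv_cdf(q)          # z-score above which proportion p lies
    r_bis = r_pb * sqrt(p * q) / NormalDist().pdf(z)            # Equation 9
    c = 1.0 / n_alternatives                                    # Equation 10
    r_adj = max(-1.0, min(1.0, r_bis / (1.0 - c)))  # Equation 11, kept in [-1, +1]
    if r_adj == 0.0:
        # Assumption: Equation 7 is undefined for a zero correlation estimate.
        raise ValueError("b is undefined when the correlation estimate is zero")
    if abs(r_adj) == 1.0:
        a = copysign(3.0, r_adj)         # Equation 6 diverges; use the +/-3.0 bound
    else:
        a = r_adj / sqrt(1.0 - r_adj ** 2)                      # Equation 6
        a = max(-3.0, min(3.0, a))       # restrict a to [-3, +3]
    b = z / r_adj                                               # Equation 7
    return a, b


# Made-up summary statistics for one item.
a, b = initial_item_parameters(mean_right=30.0, mean_wrong=24.0, sd_total=8.0, p=0.6)
print(round(a, 3), round(b, 3))
```

An item answered correctly by more than half of the testees gets a negative $z_j$ and, with a positive correlation, a negative (easy) $b$.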


Revised  Item  Parameter  Estimates 


The  item  parameter  estimates  derived  from  the  above  procedures  were  used 
to  select  items  for  the  tests  administered  in  this  study.  In  the  time  interval 
between the construction of the tests and the analysis of the data, it became
apparent  that  certain  revisions  to  these  item  parameter  estimates  were  necessary 
for  each  item.  These  revised  estimates  were  computed  for  all  569  items  in  the 
pool  from  which  items  for  this  study  were  selected. 


In computing the revised estimates of $a$ and $b$ used to analyze the present
data, the proportion of testees who actually knew the correct answer to an item
($p'_j$) was estimated from the proportion of testees in the population who
actually answered the item correctly ($p_j$) and the estimate of $c_j$ using
Equation 12,

$$p'_j = \frac{p_j - c_j}{1 - c_j} \qquad [12]$$

An estimate of $\rho_{\theta x_j}$ suggested by Urry (1975) was then computed
by Equation 13,

$$r'_j = \frac{r_{pb_j}\sqrt{p_j(1 - p_j)}}{(1 - c_j)\,\phi(z'_j)} \qquad [13]$$

where $z'_j$ is the z-score above which lies the proportion of testees in the
sample who were estimated to actually know the answer to item $j$ ($p'_j$), and
$\phi(z'_j)$ is the value of the normal probability density function at $z'_j$.

This estimate of $\rho_{\theta x_j}$ was then used, along with $z'_j$ as an
estimate of $\gamma_j$, in Equations 6 and 7 to calculate the revised $a$ and
$b$ parameters. If $p_j < c_j$, $p'_j$ was set equal to .001. If
$|r'_j| > .9486833$, $r'_j$ was set equal to .9486833 with the appropriate
sign. This restricted the $a$-values to the interval from -3.0 to +3.0 and
influenced the $b$-values through Equation 7.
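
The revised procedure (Equations 12 and 13 followed by Equations 6 and 7) can be sketched in the same way. The function name, arguments, and example values are hypothetical, Equation 13 is used in the reconstructed form given above, and the treatment of $p_j = c_j$ and of a zero correlation are assumptions added so the sketch runs.

```python
from math import copysign, sqrt
from statistics import NormalDist

R_MAX = 0.9486833   # bound on the correlation estimate stated in the text


def revised_item_parameters(r_pb, p, c):
    """Revised a and b for one item (Equations 12, 13, 6, and 7).

    r_pb -- point-biserial correlation between ability and item score
    p    -- proportion answering correctly (0 < p < 1)
    c    -- guessing parameter, e.g. 1 / number of alternatives
    """
    # Equation 12; the text sets p' to .001 when p < c. Treating p == c the
    # same way is an assumption made here to avoid an infinite z-score.
    p_know = (p - c) / (1.0 - c) if p > c else 0.001
    z_know = NormalDist().inv_cdf(1.0 - p_know)   # z-score above which p' lies
    density = NormalDist().pdf(z_know)
    r = r_pb * sqrt(p * (1.0 - p)) / ((1.0 - c) * density)      # Equation 13
    if abs(r) > R_MAX:
        r = copysign(R_MAX, r)           # keeps a within about +/-3.0
    if r == 0.0:
        # Assumption: Equation 7 is undefined for a zero correlation estimate.
        raise ValueError("b is undefined when the correlation estimate is zero")
    a = r / sqrt(1.0 - r * r)                                   # Equation 6
    b = z_know / r                                              # Equation 7
    return a, b


# Made-up inputs for one 5-alternative item.
a, b = revised_item_parameters(r_pb=0.367, p=0.6, c=0.2)
print(round(a, 3), round(b, 3))
```

With the bound applied, $a = .9486833 / \sqrt{1 - .9486833^2}$, which is 3.0 to within rounding; this is why the restriction on $r'_j$ also restricts the $a$-values.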


This latter procedure¹ differs from that suggested by Jensema (1976) only
in that Jensema chose to remove each item from the computation of the test
score estimating $\theta$ during the computation of that item's parameters. For
test scores based on large numbers of items, the effects of this exclusion
should be negligible.


Comparison  of  Original  and  Revised  Item  Parameters 

For items in the pool with b parameters between ±3.0, Figure A-1 presents
the bivariate plot of the original and the revised b parameters. As Figure A-1
shows, the revised b estimates were closely related to the original b-values
(Pearson product-moment r=.98). The bivariate plot of original and revised
a-values is shown in Figure A-2. As this figure shows, the revised a-values
were not as closely related to the original a-values (Pearson product-moment
r=.74) as were the revised b-values.


To determine the effects of the revised item parameters on ability estimates
computed using those parameters, maximum likelihood ability estimates were
computed using both sets of item parameters for the 185 CLA students involved
in this study. The bivariate plot of the two sets of maximum likelihood ability
estimates is shown in Figure A-3. The resulting Pearson product-moment
correlation of .96 indicated that the ability estimates did not differ greatly
depending on whether the original or revised normal-ogive item-parameter
estimates were used. This high correlation suggests that essentially the same
conclusions would be drawn in this study from the use of either the original
set of item parameters or the revised set of parameter estimates based on
Urry's (1975) correction procedure.


¹These procedures were suggested by James B. Sympson of the University of
Minnesota.


Figure A-1

Joint Distribution of Original and Revised Difficulty
Parameter (b) Estimates

[Scatterplot omitted. Horizontal axis: Revised b Estimate; vertical axis: Original b Estimate.]


Figure A-3

Joint Distribution of Maximum-likelihood Ability Estimates (θ)
Based on the Original and the Revised Item-parameter Estimates

[Scatterplot omitted. Horizontal axis: θ Based on Revised Item Parameters; vertical axis: θ Based on Original Item Parameters.]


APPENDIX  B 


Table B-1

Order of Administration and Normal Ogive Discrimination (a) and
Difficulty (b) Parameters for Items on Tests for the Low- and High-Ability Groups

(Item = item reference number; A, B, C, D = item sequence)

              Low-Ability Group                            High-Ability Group
 Item    A    B    C    D       a        b     Item    A    B    C    D       a        b
    2   11   32   37   16    .517   -3.810        2   41    7   27   21    .517   -3.810
    4   24   24   10   38    .397   -5.561        7   39    8   26   22   3.000   -2.324
    7    3    3    3    3   3.000   -2.324       14   22   26    8   40   2.208   -2.461
   14   40    9   25   23   2.208   -2.461       18    1    1    1    1    .483   -4.241
   18   41    7   27   21    .483   -4.241       19   28   20   14   34    .710   -3.808
   19   16   37   32   11    .710   -3.808       23   18   39   30    9    .713   -3.862
   20    1    1    1    1    .381   -5.764       24   30   18   16   32   1.749   -2.366
   23   22   26    8   40    .713   -3.862       39    5    5    5    5    .347   -3.625
   24   13   34   35   14   1.749   -2.366       44   32   16   18   30   1.145   -1.412
   29   25   23   11   37    .323   -5.521       51   27   21   13   35   1.432   -1.043
   41    7   28   41   20    .272   -6.450       56   34   14   20   28   1.109     .135
   44   15   36   33   12   1.145   -1.412       64   23   25    9   39   3.000   -2.363
   51   34   14   20   28   1.432   -1.043       68   15   36   33   12   1.014   -2.479
   55   29   19   15   33    .288   -4.953       77   10   31   38   17    .442   -3.602
   56   17   38   31   10   1.109     .135       86    7   28   41   20    .887   -1.189
   62   18   39   30    9    .426   -4.952       91   25   23   11   37   1.132    -.197
   64   39    8   26   22   3.000   -2.363      104    3    3    3    3    .944     .050
   68    6    6    6    6   1.014   -2.479      108    8   29   40   19    .536   -1.155
   72    5    5    5    5    .274   -6.134      111   33   15   19   29    .822     .936
   77   32   16   18   30    .442   -3.602      114   36   12   22   26   3.000     .960
   78    9   30   39   18    .437   -4.843      115    2    2    2    2   3.000    2.023
   86   23   25    9   39    .887   -1.189      120   38   10   24   24   3.000    1.464
   89   35   13   21   27    .721   -2.493      137    6    6    6    6    .499    -.056
   91   30   18   16   32   1.132    -.197      145   35   13   21   27    .791     .086
  108   33   15   19   29    .536   -1.155      147   17   38   31   10    .825    1.469
  111   19   40   29    8    .822     .936      154   26   22   12   36    .872    -.124
  114    8   29   40   19   3.000     .960      162   31   17   17   31   3.000    1.245
  141   38   10   24   24    .478   -1.208      167   24   24   10   38    .416    2.155
  145   10   31   38   17    .791     .086      174   16   37   32   11   3.000    1.455
  154   31   17   17   31    .872    -.124      182   11   32   37   16    .703   -3.833
  162   37   11   23   25   3.000    1.245      188   21   27    7   41    .970    -.036
  174   36   12   22   26   3.000    1.455      191   37   11   23   25   1.749   -1.257
  182   21   27    7   41    .703   -3.833      217   12   33   36   15   1.249    1.384
  188   26   22   12   36    .970    -.036      253   14   35   34   13   2.321    1.443
  191   12   33   36   15   1.749   -1.257      302   19   40   29    8    .845     .846
  192   20   41   28    7    .267   -6.518      319   40    9   25   23   3.000    2.138
  198   14   35   34   13    .801   -2.503      337    9   30   39   18   3.000    1.181
  302    2    2    2    2    .845     .846      359   29   19   15   33   3.000    2.066
  337    4    4    4    4   3.000    1.181      375   20   41   28    7    .832     .934
  375   27   21   13   35    .832     .934      383   13   34   35   14   2.111    1.518
  651   28   20   14   34   1.087     .885      514    4    4    4    4   1.158    1.741

'I 


r 


I 


n 


■< 

1 Or.  '1,'ir'5han  J.  rarr,  Director 

t’crsonr.el  S Tra’olnq  Research  Proyrd.i’S 
i.ffice  of  f.'avay  Research  (Code  458) 
'rliryton,  VA  22iM7 

1 ,0  •V’r'h  Office 

, ..  r Stri  't 

ioston,  ilA  02210 
Attn:  Dr.  Ja-es  Lester 

1 ChR  Prarch  Office 
!D30  East  arcen  Street 
Pasjde-3,  CA  91101 
Attn;  O'".  Eugene  Gloye 

t i.',R  Ci  it.;.h  Office 
536  S.  Llrri,  Street 
Chicago,  IL  60605 
Attn:  Dr.  Charles  E.  Davis 

1 Dr.  M.  A.  Oertin,  Scientific  Director 
^f‘ice  cf  Caval  Research 
Scieiuific  Liaison  Group/ToLyo 
•'  ricar  L t>,’5‘-y 
,T0  San  Fru-, cisco  96503 

1 t'ffica  of  haval  Research 
Coiie  2C0 

Arlington,  V 22217 

C r -.rranhirg  Officer 

naval  R’-'Si.arch  Laboratory 
Cede  2627 

V'ashington,  DC  20390 

1 UTD  Charlc'  J.  Thoisen,  Jr,,  f'SC,  USM 

itaval  Air  Develocr.ent  Center 
h'ar-’irster,  PA  1P974 

1 Co"j::anJinr,  Officer 

I'.S.  h'a.al  if 'ous  School 
CororaJu,  CA  ’_136 

1 CDR  Paul  0.  I.elson,  FSC,  L'S‘! 

naval  f'etiical  ~'0  Coer,  i 'Code  ^4) 
flatioral  '.c". ai  "e-jicai  C- nter 
Botnesda,  i’D  22014 

1 Co'rarding  Officer 

haval  Health  P"search  Center 
San  Diego.  CA  32152 
Attn:  Litrary 

1 C^air"'an,  Leale'*Sh’P  4 Law  Dept. 

.iv,  of  Professional  Dc-. eloprert 
J.  S.  "aval  ■‘c-'dery 
■ ar'.l  IS,  :.D  214G2 

t; ‘i'.  ..  .cr  to  t.  . Chief 

of  ’i.al  Per  •'•pgi  Or) 

' a . .1  C .r'  rj  o'  F erscr  rel 
• tti  .,  Arlirgtcn  Annex 

■ iat'iir  ;ton,  DC  2037C 

1 Ir.  ,'j(i  R.  Dors  ting 
Fru.cst  4 ■‘cad-  ic  Dean 
U.  2.  '.'ival  Fes tgraduats  Scfcol 

■ ortercy,  CA  93940 

I'r.  Maurice  Callahan 
nOOAC  (Cede  2) 

Dept,  cf  tlie  Navy 

Cld'.l.  2,  Washington  Davy  vard 

(Aracost'a) 

'.ash.in  jtor , DC  20374 


DISTRIBUTION  LIST 

1 Office  of  Civilian  Personnel 
Code  342/02  DAP 
Washington, DC 20390
Attn: Dr. Richard J. Niehaus

1 Office of Civilian Personnel
Code 262
Washington, DC 20390

1 Superintendent (Code 1424)
Naval Postgraduate School
Monterey, CA 93940

1 Mr. H. M. West III
Deputy ADCNO for Civilian Planning
and Programming (Acting)
Room 2625, Arlington Annex
Washington, DC 20370

1 Mr. George N. Graine
Naval Sea Systems Command
SEA 047C12
Washington, DC 20362

1 Chief of Naval Technical Training
Naval Air Station Memphis (75)
Millington, TN 38054
Attn: Dr. Norman J. Kerr

1 Principal  Civilian  Advisor 

for  Education  and  Training 
Naval Training Command, Code 00A
Pensacola,  FL  32226 
Attn: Dr. William L. Maloy

1 Dr. Alfred F. Smode, Director
Training Analysis & Evaluation Group
Department of the Navy


1 Chief of Naval Education and
Training Support (01A)
Pensacola, FL 32509

1 Naval Undersea Center
Code 203
San Diego, CA 92132
Attn: W. Gary Thomson

1 Navy Personnel R&D Center
Code 01
San Diego, CA 92152

5 A. A. Sjoholm, Head, Technical Support
Navy Personnel R&D Center
Code 201
San Diego, CA 92152

2 Navy Personnel R&D Center
Code 310
San Diego, CA 92152
Attn: Dr. Martin F. Wiskoff


1 Dr. Robert Morrison
Navy Personnel R&D Center
Code 301
San Diego, CA 92152

1 Navy Personnel R&D Center
San Diego, CA 92152
Attn: Library

1 Navy Personnel R&D Center
San Diego, CA 92152
Attn: Dr. J. D. Fletcher


1 Officer-in-Charge
Navy Occupational Development &
Analysis Center (NODAC)
Building 150, Washington Navy Yard
(Anacostia)
Washington, DC 20374

1 Dr. John Ford
Navy Personnel R&D Center
San Diego, CA 92152

1 Dr. Worth Scanland
Chief of Naval Education & Training
NAS, Pensacola, FL 32523

Army 

1 Technical  Director 

U.S. Army Research Institute for the
Behavioral & Social Sciences
1300 Wilson Boulevard
Arlington, VA 22209

1 Armed Forces Staff College
Norfolk,  VA  23511 
Attn:  Library 

1 Commandant
U.S. Army Infantry School
Fort Benning, GA 31905
Attn: ATSH-I-V-IT


U.S. Army Institute of
Attn: EA
Fort Benjamin Harrison,

1 Dr. Ralph Dusek
U.S. Army Research Institute
1300 Wilson Boulevard


r.r  i ' I 


' on , 


1 Dr. 
" s 


“...r'-c  1 


Inst 

1300 Wilson Boulevard
Arlington, VA 22209


1 2r. 


- rH 


1300 Wilson Boulevard
Arlington, VA 22209

1 Dr. Ralph
U.S. Army Research Institute
1300 Wilson Boulevard
Arlington, VA 22209


-Lc 


1 Dr. James L. Raney
U.S. Army Research Institute
1300 Wilson Boulevard
Arlington, VA 22209

1 Dr. Milton S. Katz, Chief
Individual Training & Performance
Evaluation Technical Area
U.S. Army Research Institute
1300 Wilson Boulevard
Arlington, VA 22209

1 HQ USAREUR & 7th Army
ODCSOPS
USAREUR Director of GED
APO New York 09403

1 ARI Field Unit - Leavenworth
P. O. Box 3122
Ft. Leavenworth, KS 66027


1 DCDR, USAADMINCEN
Bldg. #1, A310
Attn: AT21-OED Library
Ft. Benjamin Harrison, IN 46216

Air Force

1 Research Branch
AFMPC/DPMYP
Randolph AFB, TX 78148

1 AFHRL/AS (Dr. G. A. Eckstrand)
Wright-Patterson AFB
Ohio 45433

1 Dr. Marty Rockway (AFHRL/TT)
Lowry AFB
Colorado 80230

1 Instructional Technology Branch
AFHRL
Lowry AFB, CO 80230
Attn: Major Brian Waters

1 Dr. Alfred R. Fregly
AFOSR/NL, Building 410
Bolling AFB, DC 20332

1 Dr. Sylvia R. Mayer (MCIT)
HQ Electronic Systems Division


1 Mr. Frederick W. Suffa
Chief, Recruiting and Retention Evaluation
Office of the Assistant Secretary of
Defense, M&RA
Room 3D970, Pentagon
Washington, DC 20301

12 Defense Documentation Center
Cameron Station, Bldg. 5
Alexandria, VA 22314
Attn:  FC 

1 Military Assistant for Human Resources
Office of the Director of Defense
Research & Engineering
Room 3D129, The Pentagon
Washington, DC 20301

1 Director, Management Information
Systems Office
OSD, M&RA
Room 3B917, the Pentagon
Washington, DC 20301


Other Government


1 Dr. Lorraine D. Eyde
Personnel R&D Center
U.S. Civil Service Commission
1900 E Street NW


LG Hanscom Field
Bedford, MA 01730

Washington, DC 20415

/p  J 

1 Dr.  William  Gorham,  Director 

.'0  .3 

Personnel R&D Center

Lackland AFB, TX 78236

U.S.  Civil  Service  Commission 

1900 E Street NW

Major Wayne S. Sellman

Washington,  DC  20415 

Chief, Personnel Testing
AFMPC/DPMYO
Randolph AFB, TX 78148

1 Air University Library
AUL/LSE 76-443
Maxwell AFB, AL 36112

Marine Corps

1 Director, Office of Manpower
Utilization
HQ, Marine Corps (Code MPU)
BCB, Building 2009
Quantico, VA 22134

1 Dr.  A.  L.  Slafkosky 
Scientific  Advisor  (Code  RD-1) 

HQ, U.S. Marine Corps
Washington, DC 20380

Coast Guard

1 Mr. Joseph J. Cowan, Chief
Psychological Research Branch (G-P-1/62)
U.S. Coast Guard Headquarters
Washington, DC 20590

Other DoD

1 Dr. Harold F. O'Neil, Jr.
Advanced Research Projects Agency
Cybernetics Technology, Room 623
1400 Wilson Blvd.
Arlington, VA 22209

1 Dr. Robert Young
Advanced Research Projects Agency
1400 Wilson Blvd.
Arlington, VA 22209


1 Dr. Vern Urry
Personnel R&D Center
U.S. Civil Service Commission
1900 E Street
Washington, DC 20415

1 Dr. Andrew R. Molnar
Science Education Dev. & Res.
National Science Foundation
Washington,  DC  20550 

1 U.S. Civil Service Commission
Federal Office Building
Chicago Regional Staff Division
Regional Psychologist
230 S. Dearborn Street
Chicago, IL 60604
Attn: C. S. Winiewicz

1 Dr.  Joseph  L.  Young,  Director 
Memory & Cognitive Processes
National  Science  Foundation 
Washington,  DC  20550 

Miscellaneous

1 Dr.  John  R.  Anderson 
Dept. of Psychology
Yale  University 
New Haven, CT 06520

1 Dr. Scarvia B. Anderson
Educational  Testing  Service 
Suite  1040 

3445  Peachtree  Road  NE 
Atlanta,  GA  30326 

1 Professor Earl A. Alluisi
Code 267
Dept. of Psychology
Old Dominion University
Norfolk, VA 23508


1 Dr. Samuel Ball
Educational Testing Service
Princeton, NJ 08540

1 Dr. Gerald V. Barrett
University of Akron
Dept. of Psychology
Akron, OH 44325

1 Dr. Bernard M. Bass
University of Rochester
Graduate School of Management
Rochester, NY 14627

1 Dr. John Seeley Brown
Bolt Beranek and Newman, Inc.
50 Moulton Street
Cambridge, MA 02138

1 Dr. Ronald P. Carver
School of Education
University of Missouri-Kansas City
5100 Rockhill Road
Kansas City, MO 64110

1 Century  Research  Corporation 
4113 Lee Highway
Arlington,  VA  22207 

1 Dr.  Kenneth  E.  Clark 

College of Arts & Sciences
University of Rochester
River Campus Station
Rochester, NY 14627

1 Dr. Norman Cliff
Dept. of Psychology
University  of  Southern  California 
University  Park 
Los  Angeles,  CA  90007 

1 Dr. Allan M. Collins

Bolt  Beranek  and  Newman,  Inc. 

50  Moulton  Street 
Cambridge,  MA  02138 

1 Dr. John J. Collins
Essex Corporation
6305 Caminito Estrellado
San Diego, CA 92120

1 Dr.  Rene  V.  Dawis 
Dept. of Psychology
University of Minnesota
Minneapolis, MN 55455

1 Dr. Ruth Day
Dept. of Psychology
Yale University
2 Hillhouse Avenue
New Haven, CT 06520

1 Dr. Marvin D. Dunnette
Dept. of Psychology
University of Minnesota
Minneapolis, MN 55455

1 ERIC  Facility-Acquisitions 
4833  Rugby  Avenue 
Bethesda, MD 20014

1 Major I. N. Evonic
Canadian Forces Personnel
Applied Research Unit
1107 Avenue Road
Toronto, Ontario, CANADA


Dr. Richard L. Ferguson
The American College Testing Program
P. O. Box 168
Iowa City, IA 52240

Dr. Victor Fields
Dept. of Psychology
Montgomery College
Rockville, MD 20850

Dr. Edwin A. Fleishman
Advanced Research Resources Organization
8555 Sixteenth Street
Silver Spring, MD 20910

Dr. John R. Frederiksen
Bolt Beranek & Newman, Inc.
50 Moulton Street
Cambridge, MA 02138

Dr. Robert Glaser, Co-Director
University of Pittsburgh
3939 O'Hara Street
Pittsburgh, PA 15213

Dr. M. D. Havron
Human Sciences Research, Inc.
7710 Old Springhouse Road
West Gate Industrial Park
McLean, VA 22101

Dr. Duncan Hansen
School of Education
Memphis State University
Memphis, TN 38118

Human Resources Research Organization
400 Plaza Bldg.
Pace Blvd. at Fairfield Drive
Pensacola, FL 32505

HumRRO/Western Division
27857 Berwick Drive
Carmel, CA 93921
Attn: Library

HumRRO/Columbus Office
Suite 23, 2601 Cross Country Drive
Columbus, GA 31906

Dr. Lawrence B. Johnson
Lawrence Johnson & Associates, Inc.
Suite 502
2001 S Street NW
Washington, DC 20009

Dr. Roger A. Kaufman
203 Dodd Hall
Florida State University
Tallahassee, FL 32306

Dr. Steven W. Keele
Dept. of Psychology
University of Oregon
Eugene, OR 97403

Dr.  David  Klahr 
Dept. of Psychology
Carnegie-Mellon University
Pittsburgh,  PA  15213 

Dr. Alma E. Lantz
University of Denver
Denver Research Institute
Industrial Economics Division
Denver,  CO  80210 

t.r.  f.  Lassiter 
Data Solutions Corp.
Suite 211, 6849 Old Dominion Drive
McLean, VA 22101


1 Dr. Frederick M. Lord
Educational Testing Service
Princeton, NJ 08540

1 Mr.  Brian  McNally 

Educational Testing Service
Princeton, NJ 08540

1 Dr.  Ernest  J.  McCormick 

Department  of  Psychological  Sciences 
Purdue  University 

Lafayette, IN 47907

1 Dr. Robert R. Mackie
Human Factors Research, Inc.
6780 Cortona Drive
Santa Barbara Research Park
Goleta, CA 93017

1 Dr.  William  C.  Mann 

University of So. California
Information Sciences Institute
4676 Admiralty Way
Marina Del Rey, CA 90291

1 Mr. Edmond Marks
304 Grange Bldg.
Pennsylvania State University
University Park, PA 16802

1 Dr. Leo Munday
Houghton Mifflin Co.
P. O. Box 1970
Iowa City, IA 52240

1 Richard  T.  Mowday 

College of Business Administration
University of Nebraska, Lincoln
Lincoln, NE 68588

1 Dr. Donald A. Norman
Dept. of Psychology C-009
University of California, San Diego
La Jolla, CA 92093

1 Mr. Luigi Petrullo
2451 N. Edgewood Street
Arlington, VA 22207

1 Dr. Lyman W. Porter, Dean

Graduate  School  of  Administration 
University  of  California 
Irvine,  CA  92717 

1 Dr. Diane M. Ramsey-Klee
R-K Research & System Design
3947 Ridgemont Drive
Malibu, CA 90265

1 R.Dir. M. Rauch
P II 4
Bundesministerium der Verteidigung
Postfach 161
53 Bonn 1, GERMANY

1 Dr. Joseph W. Rigney
University of So. California
Behavioral Technology Laboratories
3717 South Grand
Los Angeles, CA 90007

1 Dr. Andrew M. Rose
American Institutes for Research
1055 Thomas Jefferson St. NW
Washington, DC 20007

1 Dr. Leonard L. Rosenbaum, Chairman
Dept. of Psychology
Montgomery College
Rockville, MD 20850


Dr. Benjamin Schneider
Dept. of Psychology
University of Maryland
College Park, MD 20742

Dr. Mark Reckase
University of Missouri-Columbia
Dept. of Educ. Psychology
12 Hill Hall
Columbia, MO 65201

Dr.  Robert  J.  Seidel 
Instructional  Technology  Group, 
HumRRO
300 N. Washington St.
Alexandria, VA 22314

Dr.  Richard  Snow 
Stanford  University 
School  of  Education 
Stanford, CA 94305

Dr. Dennis J. Sullivan
c/o Canyon Research Group, Inc.
32107 Lindero Canyon Road
Westlake Village, CA 91360

Dr.  Keith  Wescourt 
Dept. of Psychology
Stanford University
Stanford, CA 94305

Dr.  Anita  West 
Denver Research Institute
University of Denver
Denver, CO 80201

Dr. Earl Hunt
Dept. of Psychology
University of Washington
Seattle, WA 98105

Dr.  Thomas  G.  Sticht 
Assoc.  Director,  Basic  Skills 
National  Institute  of  Education 
1200  19th  Street  NW 
Washington,  DC  20206 


